# Replication materials for "Electoral predictors of polling errors"
This folder contains the replication materials for the paper "Electoral Predictors of Polling Errors" by Sina Chen, John Körtner, Jens Wiederspohn, and Peter Selb.


## Abstract 

Case studies of polling failures focus on within-election differences in poll accuracy. The crucial question of why polls fail in one election but not in others often remains a matter of speculation. To develop a contextual understanding, we review and unify theories of election features suspected of encouraging polling errors, including mobilization, candidacies, polarization, and electoral conduct. We extend a Bayesian hierarchical modeling approach that separates poll bias and variance at the election level and links error components to electoral predictors. Investigating 6,375 pre-election polls across 318 U.S. Senate elections, 1990-2022, we find an overall trend toward smaller but more uniform errors. Poll variance exhibits a weak negative association with mobilization and polarization. Until 2004, frontrunners and incumbents were overestimated, but there is little evidence that polls are biased for female or minority candidates. Finally, Republicans in states with lower levels of state democracy are slightly underestimated in recent years.


## File Overview

### Data Folder 

- Place the data set in this folder. The final dataset "us_senate_polls1990_2022_final.RDS", which contains all relevant polls and covariates for the analysis, may be available on request. For detailed descriptions of the variables, see the accompanying codebook "electoral_predictors_codebook.pdf".
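
Once the file is in place, it can be read with base R. The following is a minimal sketch, not part of the replication code: the helper name `load_polls` and the relative directory name `"data"` are assumptions and should be adapted to your local folder layout.

```r
# Hypothetical helper (not part of the replication scripts).
# Assumption: the data folder is reachable as "data" from the working directory.
load_polls <- function(data_dir = "data") {
  path <- file.path(data_dir, "us_senate_polls1990_2022_final.RDS")
  # Fail early with a readable message if the data set has not been placed yet
  if (!file.exists(path)) {
    stop("Data set not found at ", path, call. = FALSE)
  }
  readRDS(path)
}
```

Calling `polls <- load_polls()` followed by `str(polls)` gives a quick overview of the variables described in the codebook.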

### Code Folder 
This folder contains all code required to replicate the analyses.

#### 1. Stan Model Fitting (fit_stan folder)

The fit_stan subfolder contains code to fit the Stan models and save simulation results. Each file corresponds to a specific model.

#### Model fitting scripts:

- "fit_us_senate1990_2022_cf_distance.R": fits the CFscore distance model (seed: 1507710711)    
- "fit_us_senate1990_2022_cf_score_dem.R": fits the Dem. candidate CFscore model (seed: 1610186003)        
- "fit_us_senate1990_2022_cf_score_rep.R": fits the Rep. candidate CFscore model (seed: 252758259)
- "fit_us_senate1990_2022_control.R": fits the state control model (seed: 245804471)                 
- "fit_us_senate1990_2022_democracy.R": fits the State Democracy Score model (seed: 1004444140)              
- "fit_us_senate1990_2022_empty.R": fits the empty model (seed: 379584849)
- "fit_us_senate1990_2022_empty30.R": fits the empty model for polls conducted up to 30 days before election day (seed: 69256394)
- "fit_us_senate1990_2022_exp.R": fits the per capita campaign expenditure model (seed: 1777942762)
- "fit_us_senate1990_2022_face_prob_white_dem.R": fits the model for the probability of Dem. candidates being white based on pictures (appendix, seed: 319253659)    
- "fit_us_senate1990_2022_face_prob_white_rep.R": fits the model for the probability of Rep. candidates being white based on pictures (appendix, seed: 1505016775)
- "fit_us_senate1990_2022_front.R": fits the front runner model (seed: 203476667)
- "fit_us_senate1990_2022_gender.R": fits the gender model (seed: 958408030)
- "fit_us_senate1990_2022_inc.R": fits the incumbency model (seed: 1909853667)
- "fit_us_senate1990_2022_last.R": fits the margin last poll model (seed: 1552733845)               
- "fit_us_senate1990_2022_minority_face.R": fits the minority status model based on pictures (appendix, seed: 471621417)       
- "fit_us_senate1990_2022_minority_name.R": fits the minority status model based on names (appendix, seed: 561261443)
- "fit_us_senate1990_2022_minority.R": fits the minority model (seed: 561261443)
- "fit_us_senate1990_2022_name_prob_white_dem.R": fits the model for the probability of Dem. candidates being white based on names (appendix, seed: 668102078)
- "fit_us_senate1990_2022_name_prob_white_rep.R": fits the model for the probability of Rep. candidates being white based on names (appendix, seed: 1695463951)
- "fit_us_senate1990_2022_turnout.R": fits the turnout model (seed: 560157829)
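
The fitting scripts share a common file name prefix, so they can be collected and run in sequence. The sketch below is an illustration only: the helper names `list_fit_scripts` and `run_fit_scripts` and the relative path `code/fit_stan` are assumptions, and each script may expect a particular working directory that is not documented here.

```r
# Hypothetical helpers (not part of the replication scripts).
# Assumption: the fitting scripts live in "code/fit_stan" relative to the
# working directory.
list_fit_scripts <- function(fit_dir = file.path("code", "fit_stan")) {
  sort(list.files(fit_dir,
                  pattern = "^fit_us_senate1990_2022_.*\\.R$",
                  full.names = TRUE))
}

run_fit_scripts <- function(scripts) {
  for (script in scripts) {
    message("Running ", basename(script))
    # Evaluate each script in a fresh environment so that objects created
    # for one model do not leak into the next
    source(script, local = new.env())
  }
  invisible(scripts)
}
```

For example, `run_fit_scripts(list_fit_scripts())` would run every model in turn. Because each fit can take a long time, running a single script, e.g. `run_fit_scripts(list_fit_scripts()[1])`, is a reasonable first check.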
		
#### Stan models in the stan_ml folder:

- "ml_senate_bias_context_cat.stan": categorical predictors of election-day bias
- "ml_senate_bias_context_cat2.stan": categorical predictors of election-day bias, varying by party
- "ml_senate_bias_context_cont.stan": continuous predictors of election-day bias
- "ml_senate_context_empty.stan": empty model
- "ml_senate_var_context_cont.stan": continuous predictors of excess variance

#### 2. Results and Visualization (results_vis folder)

This folder contains code to generate election-day bias and excess variance estimates, as well as figures for the paper’s main text and appendix.

#### Key scripts
- "us_senate_context_cf_dist_res.R": computes CFscore distance estimates                
- "us_senate_context_cf_score_avg_exp.R": computes CFscore Dem. and Rep. estimates, along with average expected values        
- "us_senate_context_control_res.R": computes State control estimates                     
- "us_senate_context_democracy_avg_exp.R": computes State Democracy Score estimates, along with average expected values
- "us_senate_context_democracy_res.R": computes State Democracy Score estimates                   
- "us_senate_context_empty_res.R": computes empty model estimates                        
- "us_senate_context_exp_res.R": computes per capita campaign expenditure estimates
- "us_senate_context_face_white_prob_dem_avg_exp.R": computes estimates for the probability of Dem. candidates being white based on pictures, along with average expected values
- "us_senate_context_face_white_prob_rep_avg_exp.R": computes estimates for the probability of Rep. candidates being white based on pictures, along with average expected values
- "us_senate_context_front_res.R": computes front runner estimates                      
- "us_senate_context_gender_res.R": computes gender estimates                     
- "us_senate_context_inc_res.R": computes incumbency estimates                        
- "us_senate_context_last_avg_exp.R": computes margin last poll average expected values              
- "us_senate_context_last_res.R": computes margin last poll estimates                             
- "us_senate_context_minority_face_res.R": computes estimates for minority status based on pictures
- "us_senate_context_minority_name_res.R": computes estimates for minority status based on names
- "us_senate_context_minority_res.R": computes minority estimates
- "us_senate_context_name_white_prob_dem_avg_exp.R": computes estimates for the probability of Dem. candidates being white based on names, along with average expected values
- "us_senate_context_name_white_prob_rep_avg_exp.R": computes estimates for the probability of Rep. candidates being white based on names, along with average expected values
- "us_senate_context_turnout_res.R": computes turnout estimates
- "us_senate_vis_cf_score_dist.R": plots CFscore distance estimates for Dem. and Rep. candidates
- "us_senate_vis_comparison_30_100.R": comparison plots of election-day bias and excess variance for 30-day and 100-day models (appendix)
- "us_senate_vis_democracy_control.R": plots electoral conduct estimates
- "us_senate_vis_dte_n_poll.R": descriptive plots for the number of polls and time window (appendix)             
- "us_senate_vis_error_dist.R": plots observed TSE distributions and estimated election-day bias and standard deviation      
- "us_senate_vis_front_inc_last.R": plots front runner and incumbency estimates                 
- "us_senate_vis_minority_gender.R": plots minority status and gender estimates                
- "us_senate_vis_minority_name_face.R": plots estimates for minority based on pictures and names (appendix)             
- "us_senate_vis_turnout_exp.R": plots mobilization estimates
                          
 



## Software 
- R: version 4.3.1 
- Stan: version 2.26.1
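
To compare a local installation against these versions, a quick check along the following lines may help. This is a sketch under one assumption: that Stan is accessed through the rstan package, which this README does not state. If a different interface is used, the second part does not apply.

```r
# Compare the running R version with the one listed above
required_r <- "4.3.1"
r_ok <- getRversion() >= required_r
message("R ", as.character(getRversion()),
        if (r_ok) " meets" else " is older than",
        " the listed version ", required_r)

# Assumption: Stan is used via the rstan package (not confirmed by this README)
if (requireNamespace("rstan", quietly = TRUE)) {
  message("Stan version reported by rstan: ", rstan::stan_version())
}
```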

## Contact Information
Dr. Sina Chen  
Email: sina.chen@uni-konstanz.de

  
